SHARP: Efficient Loop Scheduling with Data Hazard Reduction on Multiple Pipeline DSP Systems

Authors

  • S. Tongsima
  • C. Chantrapornchai
  • E. Sha
  • N. L. Passos
Abstract

Computation-intensive DSP applications usually require parallel/pipelined processors in order to meet specific timing requirements. Data hazards are a major obstacle to high performance in pipelined systems. This paper presents a novel, efficient loop scheduling algorithm that reduces data hazards for such DSP applications. The algorithm has been embedded in a tool, called SHARP, which schedules a pipelined data flow graph onto multiple pipelined units while hiding the underlying data hazards and minimizing the execution time. This paper reports significant improvement for some well-known benchmarks, showing the efficiency of the scheduling algorithm and the flexibility of the simulation tool.

INTRODUCTION

In order to speed up current high-performance DSP systems, multiple pipelining is one of the most important strategies to explore. Nonetheless, it is well known that one of the major problems in applying the pipelining technique is the delay caused by dependencies between instructions, called hazards. Hazards prevent the next instruction in the instruction stream from being executed because of a branch operation (control hazards) or a data dependency (data hazards). Most computation-intensive scientific applications, such as image processing and digital signal processing, contain a great number of data hazards and few or no control hazards. In this paper, we present a tool, called SHARP, which was developed to obtain a short schedule while minimizing the underlying data hazards by exploring loop pipelining and different multiple pipeline architectures.

Many computer vendors use a forwarding technique to reduce the number of data hazards. Forwarding is implemented in hardware so that a copy of the computed result is sent back to the input prefetch buffer of the processor. However, the larger the number of forwarding buffers, the higher the hardware cost. Therefore, there is a trade-off between implementation cost and performance gain. Furthermore, most modern high-speed computers use multiple pipelined functional units (multi-pipelined) or superscalar (super)pipelined architectures, such as the MIPS R8000, IBM Power2 RS/6000 and Intel P6. Computer architects designing application-oriented processors can use the tool proposed in this paper to determine an appropriate pipeline architecture for a specific application. By varying the system architecture, such as the number of pipeline units, the type of each unit and the number of forwarding buffers, one can find a suitable pipeline architecture that balances hardware cost against performance.

This work was supported in part by the Royal Thai Government Scholarship, the William D. Mensch, Jr. Fellowship, and by the NSF CAREER grant MIP 95-01006.

[Figure: pipeline stage diagrams for the Adder and Multiplier units; stage labels IF, ID, EX, WR]
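The cost of data hazards described in the introduction can be illustrated with a toy model. The sketch below is not the SHARP algorithm; it only counts the stall cycles of a fixed instruction order on a single in-order pipeline. The names `Op`, `count_stalls` and the parameter `result_delay` are hypothetical, introduced here for illustration: `result_delay` is the number of cycles after a producer issues before a dependent operation may issue, so a smaller value loosely stands in for forwarding hardware.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str
    # Names of earlier operations whose results this operation reads.
    deps: list = field(default_factory=list)


def count_stalls(ops, result_delay):
    """Issue ops in order, one per cycle, stalling until all sources are ready.

    result_delay is an assumed model parameter: a consumer may issue no
    earlier than (producer issue cycle + result_delay).
    """
    issue_cycle = {}
    cycle = 0
    stalls = 0
    for op in ops:
        earliest = cycle
        for dep in op.deps:
            earliest = max(earliest, issue_cycle[dep] + result_delay)
        stalls += earliest - cycle      # cycles lost waiting on a data hazard
        issue_cycle[op.name] = earliest
        cycle = earliest + 1            # next op can issue one cycle later
    return stalls


# B reads the result of A; C and D are independent.
naive = [Op("A"), Op("B", ["A"]), Op("C"), Op("D")]
reordered = [Op("A"), Op("C"), Op("D"), Op("B", ["A"])]

print(count_stalls(naive, result_delay=3))      # 2 stalls without forwarding
print(count_stalls(naive, result_delay=1))      # 0 stalls with a forwarded result
print(count_stalls(reordered, result_delay=3))  # 0 stalls after reordering
```

In this model the same hazard can be removed either by hardware (a shorter `result_delay`) or by placing independent operations between producer and consumer, which is the kind of trade-off between forwarding cost and scheduling that the introduction describes.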


Similar resources

Reducing Data Hazards on Multi-pipelined DSP

Computation intensive DSP applications usually require parallel/pipelined processors in order to meet specific timing requirements. Data hazards are a major obstacle against the high performance of pipelined systems. This paper presents a novel efficient loop scheduling algorithm that reduces data hazards for such DSP applications. This algorithm has been embedded in a tool, called SHARP, which sc...


Reducing Data Hazards on Multi-pipelined DSP Architecture with Loop Scheduling

Computation intensive DSP applications usually require parallel/pipelined processors in order to meet specific timing requirements. Data hazards are a major obstacle against the high performance of pipelined systems. This paper presents a novel efficient loop scheduling algorithm that reduces data hazards for such DSP applications. This algorithm has been embedded in a tool, called SHARP, which...


Detailed Scheduling of Tree-like Pipeline Networks with Multiple Refineries

In the oil supply chain, the refined petroleum products are transported by various transportation modes, such as rail, road, vessel and pipeline. The latter provides one of the safest and cheapest ways to connect production areas to local markets. This paper addresses the operational scheduling of a multi-product tree-like pipeline connecting several refineries to multiple distribution centers ...


A novel framework for multi-rate scheduling in DSP applications

Net model for fine-grain loop scheduling. [7] S. Ha and E.A. Lee. Compile-time scheduling and assignment of data-flow program graphs with data-dependent iteration. [12] K.K. Parhi and D.G. Messerschmitt. Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding. Direct synthesis of optimized DSP assembly code from signal flow block diagrams. [14] H. Printz. Auto...


Optimization of SAD Algorithm on VLIW DSP

SAD (Sum of Absolute Difference) algorithm is heavily used in motion estimation which is computationally highly demanding process in motion picture encoding. To enhance the performance of motion picture encoding on a VLIW processor, an efficient implementation of SAD algorithm on the VLIW processor is essential. SAD algorithm is programmed as a nested loop with a conditional branch. In VLIW pro...



Publication date: 2007